Abstract
Transitioning from elementary to middle school is a time of particular vulnerability for students 
with behavior problems. This study examined the effects of class-wide function-related 
intervention teams (CW-FIT) in three middle school classrooms to determine whether this multi- 
tiered intervention could help teachers proactively manage student behavior. With a focus on 
teaching classroom expectations, delivering behavior-specific praise, and providing differential 
reinforcement within an interdependent group contingency, CW-FIT is designed to teach 
functional replacement behaviors that support students’ academic engagement. Intervention 
effects were assessed with seventh and eighth grade students from diverse backgrounds. Results, 
evaluated using a single-subject withdrawal (ABAB) design, indicated improved rates of on-task 
behavior at both class-wide and individual student levels, with corresponding increases in 
teacher praise and decreases in teacher reprimands. The positive way in which participants 
viewed CW-FIT implementation and its accompanying effects on student behaviors was 
consistent with earlier findings in elementary schools. Study limitations and areas for future 
research are discussed. 
Improving Student Behavior in Middle Schools: 
Results of a Classroom Management Intervention 

Students’ problem behaviors top the list of school concerns, with teachers consistently 
ranking disruptive, defiant, aggressive, and related classroom misconduct as a major barrier to 
teaching (Bushaw & Lopez, 2010; Harrison, Vannest, Davis, & Reynolds, 2012; Simonsen, 
Fairbanks, Briesch, Myers, & Sugai, 2008). Despite national awareness of behavior management 
difficulties, 65% of teachers report receiving little or no training to address students’ challenging 
behaviors (Reinke, Stormont, Herman, Puri, & Goel, 2011). Given the strong link between 
school behavior and academic achievement, teachers need empirically-supported tools to manage 
challenging classroom behavior if they are to meet academic goals (McIntosh, Flannery, Sugai, 
Braun, & Cochrane, 2008). 
Middle School Challenges 

Middle school is a time of particular vulnerability for students with problem behavior. 
Transition from elementary to middle school brings the change from having one teacher to 
having six or seven, with the related challenges of adapting to differential expectations 
(Bernstein, 2002). Also, many students experience decreases in academic motivation and 
achievement (Young, Caldarella, Richardson, & Young, 2012). For example, Chung, Elias, and 
Schneider (1998) studied 99 students moving from elementary to middle school and found 
increased psychological distress along with decreased academic achievement. Further, Harrison 
and colleagues (2012) found the most common adolescent behavior problems reported by 
teachers included distractibility, hyperactivity, and immature behaviors, which can lead to off-task
behavior in the classroom.

With limited resources and training, many teachers rely on reactive, punitive responses to 
classroom problem behaviors, resulting in 3.8 million school suspensions annually, with rates
dramatically higher in middle schools (Owen, Wettach, & Hoffman, 2015). School suspensions and 
expulsions disproportionately target youth of ethnic minorities and students with disabilities 
(Skiba, Shure, & Williams, 2011). Teachers who use harsh reprimands report higher levels of 
disruptive student behavior, personal discouragement, and emotional exhaustion than their peers 
(Jennings & Greenberg, 2009). These practices harm students and teachers, while providing less 
effective classroom management than more positive strategies (Reinke, Herman, & Stormont, 
2013). 

Reactive responses to student misbehavior cost teachers and students hundreds of 
instructional hours each year (Muscott, Mann, & LeBrun, 2008): on average, 20 minutes of 
instructional time for each office discipline referral (Scott & Barrett, 2004). Also, the 
disengagement that co-occurs when behavior causes conflicts with teachers increases the risk for 
later school dropout (Eccles & Midgley, 1989). When early intervention is not provided, 
misbehaviors frequently become more intense and more resistant (Sprague & Walker, 2000). 
The importance of identifying and implementing effective middle school classroom management 
interventions cannot be overstated. 

Classroom Management Components 

Clear classroom expectations are a cornerstone of effective classroom management 
(Kehle, Bray, Theodore, Jensen, & Clark, 2000; Sailor, Dunlap, Sugai, & Horner, 2013). To 
design clear expectations for classroom behaviors, teachers must identify both desired and 
undesired behaviors; as they reinforce expectations, student engagement in desired behaviors 
will increase (Epstein, Atkins, Cullinan, Kutash, & Weaver, 2008).

School-wide positive behavioral interventions and supports (SWPBIS) applies a multi-tiered
system school-wide to efficiently address the needs of all students. SWPBIS begins by 
organizing the school environment for effective, efficient, and relevant use of research-based
behavioral interventions (Sugai & Horner, 2009). Teaching clear expectations is a first tier of 
support for all students, with behavior-specific praise recommended for students who meet 
expectations (Teerlink, Caldarella, Anderson, Richardson, & Guzman, 2017). 

When SWPBIS is implemented with fidelity, approximately 80% of students respond to 
Tier 1 preventative and proactive interventions; 15% of students require a targeted Tier 2 
intervention, and fewer than 5% of students require a more intensive individualized Tier 3 
application (Sailor et al., 2013). Studies implementing SWPBIS practices in classrooms 
demonstrate similar results. SWPBIS has been implemented in elementary schools, but 
secondary schools have been less likely to adopt such practices, particularly at the classroom 
level (Freeman et al., 2016). Further investigation of this work is warranted in middle school 
classes (Sailor et al., 2013). 

Interdependent group contingencies are behavior management strategies in which 
positive reinforcement depends on the behavior of group members (Alberto & Troutman, 2017). 
Over four decades of research on interventions using group contingencies have shown the 
practice to be effective in improving students' on-task behavior (Hayes, 1976; Jenson, 1978; 
Maggin, Johnson, Chafouleas, Ruberto, & Berggren, 2012; Skiba, Casey, & Center, 1985; Stage 
& Quiroz, 1997; Theodore, Bray, & Kehle, 2004; Trevino-Maack, Kamps, & Wills, 2015). Many 
researchers recommend group contingencies because they (a) create little disruption to the 
lesson, (b) simultaneously address multiple behaviors from several students, and (c) require little 
effort from the teacher (Algozzine, Daunic, & Smith, 2010). A systematic review of group 
contingencies (Maggin et al., 2012) included 27 single-case design studies with findings 
indicating “sufficient rigor, evidence, and replication to label the intervention as evidence-based” 
(p. 625). However, the authors cited gaps in the research base and recommended additional work 
to (a) provide clearer descriptions of students best suited for the intervention, (b) measure the 
fidelity of group contingencies, and (c) explore middle school-specific effects. 

Class-Wide Function-Related Intervention Teams (CW-FIT) 

CW-FIT was originally developed as an elementary classroom management intervention 
including multiple research-based components. It incorporates clear classroom expectations 
reinforced by structured implementation of behavior-specific praise within an interdependent 
group contingency (Wills et al., 2010; see also Litow & Pumroy, 1975; Skinner, Cashwell, & 
Dunn, 1996). The Tier 1 teaching component includes positively stated classroom expectations 
and expectation lessons. After introducing an expectation through a lesson, the teacher begins 
academic instruction with a quick reminder or precorrect of the expectations. Lessons include a 
rationale, discussion, student practice, and teacher feedback. In elementary schools these lessons 
focus on following directions, gaining the teacher’s attention appropriately, and ignoring 
inappropriate peer behavior. 

The group contingency component includes (a) dividing the class into teams based upon 
seating or instructional arrangements (Naylor, Kamps, & Wills, 2018), (b) using a unique class 
reward menu to support differential reinforcement in an interdependent-group contingency 
(Wills, Wehby, Caldarella, Kamps, & Romine, 2018), and (c) providing students with positive, 
constructive teacher feedback (behavior-specific praise) to recognize and reward desired 
behavior and eliminate potential reinforcement for problem behaviors (Wills, Kamps, Caldarella, 
Wehby, & Swinburne Romine, 2018). Teachers set a timer at intervals to remind them to give 
feedback and score points (Kamps, Conklin, & Wills, 2015). 

CW-FIT has demonstrated effectiveness in elementary schools. A study by Wills et al. 
(2010) implemented Tier 1 and Tier 2 of CW-FIT in more than 35 elementary classrooms with 
over 700 students, improving students’ on-task behavior by an average of 21.67%. Students identified 
as at risk for emotional or behavioral disorders (EBD) demonstrated a nearly 50% reduction in 
disruptive behaviors. Most teachers in the study found that implementing CW-FIT helped them 
stay positive and that the intervention protected teaching time by increasing student engagement, 
decreasing student disruptions, and avoiding reactive or punitive strategies such as office 
referrals. Also, 85% of students reported they enjoyed CW-FIT, their teacher was more positive, 
and they liked earning rewards as a team. The program received high social validity from 
teachers and students along with strong administrative support (Kamps et al., 2015; Wills et al., 
2010). Teachers were also able to implement the intervention with high fidelity (with 85% or above as 
benchmark; Kamps et al., 2011). 

In another study, Wills, Iwaszuk, Kamps, and Shumate (2014) replicated CW-FIT three 
times each day under various academic settings in a first-grade classroom in a school that had 
adopted SWPBIS three years prior. Students’ on-task behavior at baseline averaged 60% across 
the three class applications, increasing to an average of 94% after implementation. Three target 
students’ on-task behavior also increased significantly. The teachers’ praise doubled during CW- 
FIT implementation, and their reprimands decreased significantly between baseline and 
intervention phases. 


Research Purpose 
While CW-FIT Tier 1 implementation has been effective in elementary schools, research 
has not yet examined its effectiveness in middle school classrooms. The particular challenges 
and related behavior problems for middle school students warrant exploring the effects of
CW-FIT in middle school contexts (CW-FIT MS). This is the first study to do so, addressing
improvements in classrooms as well as in outcomes for individual students identified as at risk.

Five questions guided this research: 

1. Can middle school teachers implement CW-FIT MS with fidelity? 
2. How does CW-FIT MS impact teacher praise and reprimand frequencies? 
3. How does CW-FIT MS impact students’ on-task behavior at the classroom level? 
4. How does CW-FIT MS impact the on-task behavior of individual students 
nominated by their teacher based on off-task and disruptive behavior? 
5. Do teachers and students find CW-FIT MS to be a socially valid intervention to 
address off-task behavior? 
Method 
Participants and Settings 

After informed consent had been obtained, this study was conducted in one classroom at 
each of three middle schools—all Title 1 schools that had been implementing SWPBIS at various 
levels with established school expectations and reward/recognition systems. Two at-risk students 
were targeted in each of these classes for a total of six. Classes ranged from 20 to 28 students. 
Class 1 in School 1 was a seventh-grade class in a public school in an urban Western U.S. city, 
serving 845 students, 65% of whom qualified for free or reduced-price lunch. The majority of 
students identified as Caucasian (54.8%) or Hispanic (37.5%). In its third year of SWPBIS 
implementation, School 1 did not have a formal assessment available, yet efforts were evident, 
with expectations posted throughout the school; a team designated to review data, routines and 
procedures for teaching; and a school-wide reward system. 

Class 2 in School 2 was an eighth-grade class in a public school in an urban Midwestern U.S. 
city, serving over 812 students, 85.1% of whom qualified for free or reduced-price lunch. A 
majority of students identified as Caucasian (55.4%); ethnic minority groups identified as 
African American (16.6%), Hispanic (15.0%), and Asian (5.3%). In its fifth year of 
implementing SWPBIS, School 2 had a recent overall score of 89% on its School-wide 
Evaluation Tool (www.pbis.org). Its Self-Assessment Survey (www.pbis.org) showed 80% of 
items in place. 

At School 3 the study was conducted in a seventh-grade classroom in a public school 
in an urban Midwestern U.S. city, serving over 648 students, 83.6% of whom qualified for free or 
reduced-price lunch. A plurality of students identified as Caucasian (49.4%), with others 
identifying as Hispanic (18.1%), African American (15.3%), and Asian (5.6%). School 3 had 
received state recognition for excellence in SWPBIS implementation and had a recent Tiered 
Fidelity Inventory (www.pbis.org) score of 96% and Self-Assessment Survey ranking of 90%. 

The three participating teachers were all female; Teacher 1 was Hispanic; Teachers 2 and 
3 were Caucasian. Teacher 1 (School 1) had over 29 years of teaching experience, all at the same 
school. For the study she selected her last science class of the day, due to off-task and disruptive 
behavior. Teacher 2 (School 2) had taught for over 21 years, the last three at the school where the 
study was conducted. She selected a mid-day math class in which students were frequently off 
task. Teacher 3 had been at School 3 for all of her six years of teaching. She selected the last 
science class of her day because she noted the students had difficulty focusing. Prior to this study 
the classroom teachers had managed problem student behavior using a school currency system 
(tickets), verbal reprimands, redirection, loss of privileges, and office discipline referrals. 

Target students were nominated by their teacher as at risk for off-task disruptive behavior 
according to the Systematic Screening for Behavior Disorders (SSBD; Walker & Severson, 
1992). The SSBD's standardized norm-based multiple-gating assessment procedure includes 
three stages: (a) teacher screening and ranking of all students in the classroom for internalizing 
or externalizing behavior criteria, (b) teacher rating of three students most severe on critical 
events and maladaptive behavior, and (c) direct observation of students exceeding the normative 
criteria on the standardized teachers’ rating. Using SSBD Stage 1, teachers ranked students on 
externalizing classroom behaviors, and informed parental consent was obtained for them to 
participate in this study. Stage 2 of the SSBD was not used, as it had not been normed with 
middle school students. Direct observations were conducted to confirm that students displayed 
low levels of on-task behavior (below 70%) in a 10-minute observation. Teacher 1 identified two 
seventh-grade students: Student 1, a 12-year-old Hispanic female, and Student 2, a 12-year-old 
Hispanic male. Teacher 2's selection was two eighth-grade students: Student 3, a 13-year-old 
African American male, and Student 4, a 13-year-old Middle Eastern male. Teacher 3 targeted 
two seventh-grade students: Student 5, a 13-year-old Hispanic male, and Student 6, a 13-year-old 
Hispanic female. 

Data Collection Procedures 

Baseline and intervention sessions consisted of one to two 10-minute observations per 
day, depending on instructional activity and class period length. Observations were 
collected only when the teacher was instructing. If the teacher lectured for part of the class period and 
then encouraged independent work for the remainder (two distinct instructional formats), two data points 
were collected. Class periods varied from 56 to 90 minutes. 

Data were collected for (a) on-task behavior at the classroom level, (b) teacher praise and 
reprimands, and (c) on-task behavior of the target students identified with challenging behavior. 
The primary dependent variable was the on-task behavior at the classroom level measured with 
the group on-task observation form. On-task behavior was defined as students being within the 
area of instruction, complying with instructions for academic tasks, attending to the teacher 
and/or appropriate materials, asking and answering questions, reading and/or writing. Teacher 
praise was defined as a verbal statement indicating approval of behavior beyond an evaluation of 
adequacy or acknowledgement of a correct response to a question (e.g., “I appreciate that Mark 
opened his science book when asked and waited for further instructions.”). Teacher reprimand 
was defined as verbally scolding or negatively commenting about behavior, often with the intent 
to stop misbehavior. This included statements or threats of negative consequences (e.g., “Table 3 
needs to stop talking or they will lose end-of-class free time.”). Teacher praise and reprimands 
could be made to an individual student or a group. 

Classroom level group on-task behavior was measured using a momentary time sampling 
measure with paper and pencil. Observers established student groups of three to six based on 
proximity, such as a row or cluster of desks. Class 1 included five groups, Class 2 had seven 
groups, and Class 3 worked in six groups. The groups remained consistent throughout the study. 
On-task behavior was recorded every 30 seconds for a 10-minute period. An 
on-task score was awarded when every student in a group was on task. Every 30 seconds the 
observer scanned each group and recorded a + if all students in the group were on task and a − if 
any student in the group was off task. The scan consistently progressed in the same sequence 
(e.g., Group 1, Group 2, and so on). Observers would look up at the group, record the + or −, and 
then proceed to the next group. If a reliability observer was present, the group was quietly 
announced (e.g., “Group 1”) and then each observer recorded the result before the next group was 
announced. 

After recording on-task data for groups in a classroom, observers recorded the on-task 
behavior of the two individual target students, who were not in the same group. These data 
followed the same momentary time-sampling procedure of recording every 30 seconds for 10 
minutes. While these students were first recorded as part of a group, they were then recorded 
individually, usually 15 to 20 seconds after they were recorded as part of their group. A student 
could have been off task during the recording of his group’s behavior yet on task at the moment 
he was individually recorded. 

At the end of each 10-minute observation, on-task behaviors were averaged as a 
percentage on task per 10-minute period (Kamps, Conklin, & Wills, 2015). Each group had a 
percentage of intervals recorded as on task. The percentages were then averaged for an overall 
classroom average. Target students had an individual on-task percentage for the 10-minute 
period simply calculated as the total number of on-task intervals (+) divided by 20 (the total 
number of intervals). 
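
The scoring arithmetic described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the sample marks are hypothetical and are not taken from the study's observation forms.

```python
def percent_on_task(marks):
    """Percentage of intervals marked '+' (on task)."""
    if not marks:
        raise ValueError("at least one interval is required")
    on_task = sum(1 for mark in marks if mark == "+")
    return 100.0 * on_task / len(marks)


def classroom_average(group_marks):
    """Mean of the per-group on-task percentages."""
    per_group = [percent_on_task(marks) for marks in group_marks.values()]
    return sum(per_group) / len(per_group)


# Hypothetical data: 20 intervals (one every 30 seconds for 10 minutes).
groups = {
    "Group 1": ["+"] * 15 + ["-"] * 5,
    "Group 2": ["+"] * 10 + ["-"] * 10,
    "Group 3": ["+"] * 20,
}
target_student = ["+"] * 13 + ["-"] * 7

print(classroom_average(groups))        # mean of the three group percentages
print(percent_on_task(target_student))  # on-task intervals divided by 20
```

Because a group is scored + only when every member is on task, a group percentage can be lower than the percentage for any individual student in that group.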

Throughout the 10-minute on-task observation, the observers (research assistants and 
graduate research assistants) recorded the frequency of the primary teacher’s praise and 
reprimands. Each praise and reprimand statement from the teacher was tallied, whether it was 
directed to an individual, a group, or the entire class. 

Social Validity. A social validity survey was given to the teachers and their students 
immediately following intervention. The teacher survey included seven items: five with a 4-point 
Likert-type scale, and two as open-ended questions. The Likert-type scale consisted of four 
options: 1 = not true, 2 = somewhat true, 3 = mostly true, 4 = very true. The open-ended 
questions asked the teachers what they considered most helpful in learning to implement the 
CW-FIT Middle School program and how they would suggest modifying the program for future 
use. Scores were averaged across Likert-type items for a total score out of 4.0, with higher scores 
indicating more positive ratings. The student survey included four items: two with a yes or no 
response option and two open-ended questions asking what the students liked most about CW- 
FIT MS and what, if anything, they did not like about it. 

Interobserver agreement. Before the study all data collectors were trained by taking 
data with the paper-pencil observation techniques in other classrooms until reaching the criterion 
of 85% reliability across three sessions. Interobserver agreement was collected on 29% of all 
paper-pencil observations during baseline, intervention, and withdrawal conditions. A second 
individual (a graduate research assistant) collected the interobserver agreement data. Across all 
conditions interobserver agreement was 94% (range 90%-100%) for class on-task behavior, 
teacher praise, and teacher reprimand. Interobserver agreement for class on-task behavior 
averaged 93% during baseline (range 91%-99%), 97% during intervention (range 94%-100%), 
and 90% during reversal (range 85%-93%). Interobserver agreement for teacher praise averaged 
87% during baseline (range 0%-100%), 95% during intervention (range 0%-100%), and 84% 
during reversal (range 0%-100%). Interobserver agreement for teacher reprimand averaged 80% 
during baseline (range 0%-100%), 99% during intervention (range 88%-100%) and 94% during 
reversal (range 50%-100%). On a few occasions an observer recorded a single praise or reprimand 
statement that the second observer did not record, resulting in an interobserver agreement score of
0% for that session. Interobserver agreement for target students’ on-task behavior was 95% during
baseline (range 90%-100%), 97% during intervention (range 90%-100%), and 89% during reversal (range 
80%-97%). 
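
The text does not state the agreement formulas, so the following Python sketch rests on two assumptions: that agreement on frequency counts (praise, reprimands) was the smaller count divided by the larger, and that agreement on interval data was the proportion of intervals scored identically. Both are common conventions, and the first reproduces the 0% score described above when one observer tallies a single statement and the other tallies none. The function names are hypothetical.

```python
def frequency_ioa(count_a, count_b):
    """Total-count agreement: smaller count / larger count (assumed formula)."""
    if count_a == count_b:
        return 100.0  # also covers the case where both observers recorded 0
    return 100.0 * min(count_a, count_b) / max(count_a, count_b)


def interval_ioa(marks_a, marks_b):
    """Interval-by-interval agreement (assumed formula)."""
    if not marks_a or len(marks_a) != len(marks_b):
        raise ValueError("both observers need the same nonzero interval count")
    agreements = sum(1 for a, b in zip(marks_a, marks_b) if a == b)
    return 100.0 * agreements / len(marks_a)


# One observer tallies a single praise statement, the other tallies none.
print(frequency_ioa(1, 0))

# Hypothetical interval records that differ on 1 of 20 intervals.
observer_1 = ["+"] * 14 + ["-"] * 6
observer_2 = ["+"] * 15 + ["-"] * 5
print(interval_ioa(observer_1, observer_2))
```

Under the assumed count formula, sessions with very few statements swing between 0% and 100%, which would explain the wide ranges reported for praise and reprimands.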

Intervention fidelity. Fidelity data for the CW-FIT MS Tier 1 intervention were collected on 
100% of the sessions throughout baseline and intervention periods. A nine-item fidelity form was 
completed by the observer at the end of each baseline and CW-FIT MS observation. Items were 
recorded as not present (NP) or rated on a Likert-type scale from 1 to 3 to indicate quality. A 
percentage was calculated with 27 possible points (nine questions with 3 points possible). 
Teachers were expected to implement CW-FIT MS with 85% fidelity (Kamps et al., 2011), 
including procedures such as posting the point goal and reward, setting the timer at appropriate 
intervals, and achieving a praise-to-reprimand ratio of 4:1 (a ratio shown to result in positive 
student behavior; Trussell, 2008). Observers were in the classroom for the entire period to 
calculate all aspects of fidelity, including points tallied and rewards delivered. Interobserver 
agreement for procedural fidelity averaged 98% (range 89%-100%). 
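
As a rough illustration of the fidelity score, the Python sketch below converts nine item ratings into a percentage of the 27 possible points and compares it with the 85% benchmark. Treating an NP item as 0 points is an assumption, as are the function name and the sample ratings.

```python
FIDELITY_BENCHMARK = 85.0  # percent, per the benchmark cited in the text
MAX_POINTS = 27            # nine items, 3 points possible on each


def fidelity_percent(item_scores):
    """Percent of possible points; None stands for 'not present' (NP)."""
    if len(item_scores) != 9:
        raise ValueError("the fidelity form has nine items")
    # Assumption: an NP item contributes 0 points.
    points = sum(score or 0 for score in item_scores)
    return 100.0 * points / MAX_POINTS


# Hypothetical sessions.
strong_session = [3, 3, 3, 3, 2, 3, 3, 2, 3]     # 25 of 27 points
weak_session = [3, 3, None, 2, 3, 2, 3, 2, 3]    # 21 of 27 points

for scores in (strong_session, weak_session):
    pct = fidelity_percent(scores)
    print(f"{pct:.1f}% meets benchmark: {pct >= FIDELITY_BENCHMARK}")
```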
Design 

To evaluate the effects of the CW-FIT MS Tier 1 intervention, an ABAB withdrawal 
design (Kazdin, 2011) was used, including baseline, classroom intervention, withdrawal, and a 
final period of CW-FIT MS. All phase change decision rules were based on the primary 
dependent variable of class on-task behavior, with a rule of five minimum data points per 
condition, although target students who were absent or suspended on data collection days had 
fewer than five data points. Additional data were collected if analysis revealed trending or 
variable class on-task data. 


CW-FIT MS Intervention 
After five observations of baseline, CW-FIT MS was implemented in the classrooms. 
Teachers were trained on the CW-FIT MS protocol in two 30-minute sessions or one hour-long 
session, depending on the teacher’s availability, at the beginning of spring semester. Training 
consisted of showing video clips of CW-FIT MS in middle school classrooms as well as 
familiarizing the teachers with a procedural fidelity form offering information on how to
precorrect behavior, offer behavior-specific praise/corrections, and reward students. Research 
assistants and graduate research assistants coached teachers in reaching intervention fidelity and 
remained in the classroom to provide feedback after the first three intervention periods. Teachers 
implemented the intervention for three months. 

Intervention Procedures 

CW-FIT in middle schools (CW-FIT MS) was revised from the original elementary 
version to fit the context of middle schools: (a) lesson structure was revised for more active 
student participation, (b) only two lessons were taught due to limited time available, (c) the 
primary lesson taught was on respect, a topic consistent with most SWPBIS expectations, (d) 
longer intervals were used with fewer points and less praise (timer set at 5 minutes rather than 3- 
5), (e) teacher training was abbreviated and coaches did not provide in-class modeling, and (f) 
teachers received brief feedback at the end of class periods to fit the hurried time between 
classes. 

Following baseline and the initial training session(s), the teachers taught 10-minute 
lessons on two primary CW-FIT MS classroom expectations: a lesson on respect and a lesson of 
the teacher’s choice. On the first day all teachers delivered a respect lesson; on the second day 
Teacher 1 and Teacher 3 chose to teach following directions, and Teacher 2 chose to review the 
respect lesson. The class worked together to define respect (or following directions) for their 
specific classroom. Within their groups students brainstormed ideas for respectful behavior in the 
classroom, which the teacher compiled on a large sheet. The teacher worked with the students to 
condense the ideas into themes so a final bulleted list could be created. Each skill was broken 
down into steps to show behavioral expectations. The class discussed the rationale for each skill, 
including its fit with school-wide expectations. Each skill was incompatible with the problem 
behaviors reported by the teacher: being disrespectful to peers or teacher, talking too loudly, 
yelling out answers, ignoring directions, becoming distracted by peers, calling out for teacher 
attention, arguing, engaging in noisy transitions, and making disruptive noises. The expectations 
were considered reasonable and relevant for middle schoolers, as the students had helped create 
them. All expectations were posted where students could see them. 

To begin implementing CW-FIT MS, teachers emphasized the expectation lessons taught 
during the first two days (respect and teacher’s choice) and explained to the class that they would 
be rewarding students for following the classroom expectations. Each day the intervention was 
implemented for the entire class period. Every class period began with a brief precorrect, 
reviewing the expected classroom behaviors and reminding students that demonstrating these 
behaviors would help them earn points. The teachers assigned students to teams based on 
groupings of desks and explained that a timer would sound every five minutes. At this signal the 
teacher provided behavior-specific praise and awarded points to groups that were on task at that 
moment. A group with every member on task would receive a point. Teachers used specific 
praise referring to classroom expectations to describe the on-task behaviors earning points. A 
group that did not earn points was provided with behavior-specific feedback. The points were 
tallied on an 11x17-inch point chart at the front of the room, which included the points for each 
team, the goal for the day, and the reward for the day. Points were tallied at the end of the period, 
and groups were rewarded for meeting the point goal set at the beginning of class. Rewards had 
been announced at the beginning of class, selected from a list of options previously created with 
class input. 

A typical intervention session involves seven steps: (a) the teacher precorrects or prompts 
expectations, (b) she announces the day's point goal, (c) instruction begins, (d) the teacher sets 
the timer for 5-minute intervals, (e) she provides feedback/points contingent on behavior during 
the interval, (f) she tallies points at end of the class period, and (g) teams meeting the point goal 
receive a reward. The teacher calculates the point goal as approximately 80% of the number of 
intervals possible (see Nelson et al., 2018). For example, during a 50-minute class with the timer 
set for 5-minute intervals, 10 opportunities for points would be available, and the goal would be 
set for 8 points per team. 
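
The point-goal rule can be written as a short Python function. This is a sketch under stated assumptions: the text says only that the goal approximates 80% of the possible intervals, so counting whole intervals and rounding to the nearest point are choices made here for illustration, and the function name is hypothetical.

```python
def point_goal(class_minutes, interval_minutes=5, proportion=0.8):
    """Approximate daily point goal per team."""
    if class_minutes <= 0 or interval_minutes <= 0:
        raise ValueError("durations must be positive")
    # Assumption: only complete timer intervals offer a chance to earn a point.
    intervals = class_minutes // interval_minutes
    # Assumption: round to the nearest whole point.
    return round(proportion * intervals)


# The 50-minute example from the text, plus the shortest and longest
# class periods reported in this study (56 and 90 minutes).
for minutes in (50, 56, 90):
    print(minutes, point_goal(minutes))
```

For the 50-minute example this gives 10 intervals and a goal of 8 points, matching the text.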

Results 
Procedural Fidelity of Middle School Teachers 

Table 1 displays the teachers’ procedural fidelity per phase of the study. Overall fidelity 
with the intervention averaged 91.3% across the 66 periods, ranging from 77% to 100%. High 
fidelity was defined as 85% or above. Teacher 1's fidelity percentage was 0.7% in baseline, 95% 
during intervention, 8.88% in reversal, and 97% in the last phase of intervention. Teacher 2 went 
from a fidelity percentage of 0% in baseline to 97% during intervention, 40% in reversal, and 
85% in the last phase. Teacher 3 had a fidelity percentage of 0% in baseline, 79.1% during 
intervention, and 0% in reversal, but ended with 95.4% in the last phase of intervention. 
Impact of CW-FIT MS on Teacher Praise and Reprimand 

Figure 1 represents the frequency of teachers’ praise and reprimands. During baseline all 
three teachers averaged 1 praise statement per 10-minute observation. After the intervention was 
implemented, frequency of praise more than doubled in all three classes, although a noticeable 
separation occurred concerning the desired 4:1 ratio. Teacher 1 had an increase in praise during 
both intervention periods and a decrease in reprimands. She averaged 1 praise statement per day 
in baseline, 6.6 praise statements during intervention periods, and 1.6 praise statements per day 
in reversal. She averaged 2.4 reprimands per day in baseline, 0.6 reprimands per day during 
intervention periods, and 1.2 reprimands per day in reversal. Teacher 2 increased praise during 
the first intervention period, but increased reprimands during the reversal and last intervention 
phases. She averaged 0.8 praise statements per day in baseline, 2.6 praise statements during 
intervention periods, and 1.6 in reversal. She averaged 1.2 reprimands per day in baseline, 0.9 
reprimands per day during intervention, and 2.6 reprimands per day in reversal. Teacher 3 
increased praise and decreased reprimands during both interventions, but had higher reprimand 
rates during baseline and reversal phases. Teacher 3 averaged 1 praise statement per day in 
baseline, 5.1 praise statements during intervention periods, and 1.2 praise statements per day in 
reversal. Per day she averaged 6 reprimands in baseline, 2.6 reprimands during interventions, and 
10.4 reprimands in reversal. 
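
To make the comparison with the desired 4:1 ratio concrete, the following sketch divides each
teacher's reported mean praise rate by her mean reprimand rate in each condition. It is an
illustration only: the intervention values are the means reported above for the intervention
periods combined, and the variable and function names are hypothetical.

```python
# (mean praise, mean reprimands) per condition, transcribed from the text above.
PHASE_MEANS = {
    "Teacher 1": {"baseline": (1.0, 2.4), "intervention": (6.6, 0.6), "reversal": (1.6, 1.2)},
    "Teacher 2": {"baseline": (0.8, 1.2), "intervention": (2.6, 0.9), "reversal": (1.6, 2.6)},
    "Teacher 3": {"baseline": (1.0, 6.0), "intervention": (5.1, 2.6), "reversal": (1.2, 10.4)},
}

TARGET_RATIO = 4.0  # the desired 4:1 praise-to-reprimand ratio


def ratios_by_phase(phase_means):
    """Praise-to-reprimand ratio for every teacher and condition, rounded to 2 places."""
    return {
        teacher: {phase: round(praise / reprimands, 2)
                  for phase, (praise, reprimands) in phases.items()}
        for teacher, phases in phase_means.items()
    }


def teachers_meeting_target(phase_means, phase="intervention", target=TARGET_RATIO):
    """Teachers whose ratio in the given condition is at or above the target."""
    return [
        teacher
        for teacher, phases in phase_means.items()
        if phases[phase][0] / phases[phase][1] >= target
    ]


for teacher, ratios in ratios_by_phase(PHASE_MEANS).items():
    print(teacher, ratios)
print("At or above 4:1 during intervention:", teachers_meeting_target(PHASE_MEANS))
```

On these means the intervention ratio is about 11:1 for Teacher 1, 2.9:1 for Teacher 2, and 2:1
for Teacher 3, so only Teacher 1 reaches the 4:1 target.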
Impact of CW-FIT MS on Classroom-Level Student Behavior 

Figure 2 illustrates the percentage of intervals with students on task across all groups in 
all three classrooms. For Class 1, mean improvements were made, although overlapping data and 
an ascending second baseline condition require caution in interpretation of a functional 
relationship between the CW-FIT MS and improvements in classroom on-task behavior. 
Baseline on-task behaviors averaged 52% (range 39%-75%). Introduction of CW-FIT MS 
brought on-task behaviors to an average of 70% (range 51%-93%), and withdrawal of CW-FIT
MS decreased on-task behaviors to an average of 61% (range 45%-76%). When CW-FIT MS
was reinstated, on-task behavior increased to an average of 86% (range 78%-90%). For Class 2,
on-task behaviors averaged 71% at baseline (range 65%-78%), increased to an average of 89%
(range 81%-97%) with introduction of CW-FIT MS, but decreased to an average of 62% (range
56%-80%) at withdrawal. After CW-FIT MS was reintroduced, on-task behaviors increased to an
average of 84% (range 74%-93%). For Class 3, on-task behaviors averaged 47% at baseline (range
40%-53%), increased to an average of 77% (range 51%-96%) during intervention, decreased to an
average of 29% (range 27%-40%) at withdrawal, and finally increased to an average of 88% (range
78%-95%) at reintroduction. At baseline a low percentage of on-task behaviors was observed for Classes 1
and 3 and a stable moderate rate for Class 2. 

Introduction of the CW-FIT MS intervention increased on-task behavior for two of the 
three classes; however, Class 1 showed overlapping data and an ascending second baseline,
requiring cautionary interpretation of the data. A discernible mean shift was observed in all three 
classes, with smaller but more stable effects in Class 2 and larger more variable effects in Class 
3. Withdrawal of the CW-FIT MS intervention caused an immediate drop in rates of on-task 
behavior in all three classrooms, with a visible mean shift for Classes 2 and 3. Reintroduction of 
the intervention brought about a clear mean shift for all three classes and stable high rates of on- 
task behavior. A functional relationship between CW-FIT MS and higher rates of on-task 
behavior was supported for Class 2 and Class 3, with weaker support noted for Class 1 due to 
data variability and the increasing trend noted in the return to baseline condition. 
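
The size of each mean shift can be recomputed from the phase averages reported above. The
sketch below is a minimal illustration in Python, assuming the four phase means per class are
listed in ABAB order; it expresses changes in percentage points and is not a substitute for
visual analysis of the graphed data. All names are hypothetical.

```python
# Mean percentage of intervals on task per phase, in ABAB order
# (baseline, intervention, withdrawal, reintroduction), from the text above.
CLASS_MEANS = {
    "Class 1": [52, 70, 61, 86],
    "Class 2": [71, 89, 62, 84],
    "Class 3": [47, 77, 29, 88],
}


def phase_shifts(means):
    """Change in percentage points between each pair of consecutive phases."""
    return [later - earlier for earlier, later in zip(means, means[1:])]


SHIFTS = {name: phase_shifts(means) for name, means in CLASS_MEANS.items()}

# Average gain from baseline to first intervention across the three classes.
MEAN_FIRST_GAIN = sum(shifts[0] for shifts in SHIFTS.values()) / len(SHIFTS)

for name, shifts in SHIFTS.items():
    print(name, shifts)
print("Mean baseline-to-intervention gain:", MEAN_FIRST_GAIN)
```

Read this way, the first introduction of CW-FIT MS raised mean on-task behavior by 18, 18, and 30
percentage points in Classes 1, 2, and 3 (22 points on average), and the drop at withdrawal was
smallest in Class 1 (9 points), consistent with the weaker support noted for that class.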

Impact of CW-FIT MS on Target Student Behavior 

Figures 3 and 4 show the on-task behavior of the six target students. With the
introduction of CW-FIT MS, all six students increased average on-task behavior from baseline
averages of 49%, 65%, 40%, 50%, 50%, and 57%, respectively, to first intervention averages of
62%, 84%, 82%, 77%, 64%, and 92%. Although visual analysis of the individual student data
was not used to determine condition changes (the primary dependent variable being class-wide
on-task behavior), the graphs show variability in the data and no clear mean shift for
Student 1, Student 2, and Student 5, which limits confidence in the functional relationship
between the intervention and rates of on-task behavior for these participants. Conversely, rates
of on-task behavior after introduction of the intervention and return to baseline do support a
functional relationship for Student 3, Student 4, and Student 6.
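
For reference, the individual gains implied by these averages can be worked out directly. The
following Python sketch is illustrative only; it assumes the two lists above are ordered
Student 1 through Student 6, and the names used are hypothetical.

```python
# Mean percentage of intervals on task, assumed to be ordered Student 1..6.
BASELINE = [49, 65, 40, 50, 50, 57]
FIRST_INTERVENTION = [62, 84, 82, 77, 64, 92]


def gains(before, after):
    """Percentage-point gain for each student from baseline to first intervention."""
    return [b_after - b_before for b_before, b_after in zip(before, after)]


STUDENT_GAINS = gains(BASELINE, FIRST_INTERVENTION)

for student, gain in enumerate(STUDENT_GAINS, start=1):
    print(f"Student {student}: +{gain} points")
print("Smallest gain:", min(STUDENT_GAINS), "Largest gain:", max(STUDENT_GAINS))
```

The resulting gains are 13, 19, 42, 27, 14, and 35 percentage points. A mean gain alone does not
establish a functional relationship, which is why the variability and overlap visible in the
graphs still matter for Students 1, 2, and 5.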
Social Validity for Teachers and Students 

Table 2 provides the social validity results for each teacher. All three teachers reported 
that they had received adequate training and found the support/feedback from the researchers to 
be helpful. The teachers also affirmed that they found the use of a procedural fidelity worksheet 
to be helpful in learning the intervention. They reported that they will continue to use CW-FIT 
MS moving forward. Teachers also provided feedback through answering open-ended questions. 
Responses to the first open-ended question (“What was most helpful to you in learning how to
implement the CW-FIT MS program?”) included “Researcher's observations, input and support
of the class were very valuable to me as a teacher” and “Practice over multiple days, charts
provided, not drastically different from previous training on behavior management.” Responses
to the second open-ended question (“How would you modify the CW-FIT MS program for
future use?”) included “Make the chart bigger, add a monthly reward,” “I am trying to continue
using CW-FIT MS stretching the time a bit longer,” and “I would only occasionally use a timer.” 

Student satisfaction with the intervention was assessed with all students in the classes
(N=69) completing an anonymous survey with two items scored yes or no and two open-ended
items. To the item “I enjoyed CW-FIT MS,” 91% of the students responded with yes. The item
“Do you think CW-FIT MS could help students get more work done in their classrooms?” also
received 91% yes responses. Responses to the open-ended item asking what they liked most 
about CW-FIT MS fell into four general categories: (a) 46% liked reward/points, (b) 33% felt 
that students focused and learned more, (c) 10% noted the challenge and team effort, and (d) 
11% commented that CW-FIT MS was fun or provided other generic positive responses. The 
second open-ended item, asking what, if anything, they did not like about CW-FIT MS, drew 
responses from 17% of the students, with comments such as “people need a reason to behave,”
“it’s hard to be quiet,” “not all positivity,” as well as mention of problems with competitiveness
and with students arguing about whether they should get a point. 
Discussion 

This study evaluated student and teacher responses to implementation of CW-FIT MS in 
Title 1 seventh and eighth grade classrooms. Target classes were science and mathematics, 
content areas with limited research on behavioral issues. Results of this study supported previous 
findings, extending results achieved at elementary levels. Findings are discussed in terms of the 
four research questions. 

First, with respect to implementation fidelity, data collected in three classrooms during all 
observation periods yielded an average fidelity score of 91.3% (range 77%-100%). 
Correspondingly high fidelity (85% or above) had been reported in prior studies assessing the 
CW-FIT implementation in elementary classrooms (Caldarella, Williams, Hansen, & Wills, 
2015; Kamps, Wills, Dawson-Bannister, Kottwitz, Hansen, & Fleming, 2015; Wills et al., 2014; 
Wills, Kamps, Fleming, & Hansen, 2016). 

Second, regarding observed changes in teachers’ praise and reprimand frequencies, two
of the three teachers demonstrated a marked increase in praise statements during the intervention
phases compared to baseline conditions. Considering the concurrent decrease in reprimands, an
inverse relationship between praise and reprimand frequencies was noted, results comparing 
favorably with the longitudinal effects achieved by Kamps, Wills et al. (2015). The exception 
was Teacher 2’s praise to reprimand ratio during the second intervention phase, with a pattern 
similar to pre-intervention conditions. However, despite the teacher's decrease in praise 
statements and increase in reprimands, students' on-task percentages remained high. 

Third, although on-task behavior during CW-FIT MS sessions varied across classrooms, 
the average improvement was greater than 20%, similar to findings achieved earlier in 
elementary schools (Kamps et al., 2011; Kamps, Wills et al., 2015). Data for Class 1
showed an upward trend during baseline before intervention began, making the results less
convincing for this class. The improvements for individual students nominated as particularly at
risk for externalizing behaviors were also similar to those in prior studies of CW-FIT in
elementary schools (Weeden, Wills, Kottwitz, & Kamps, 2016; Wills et al., 2016). All of these
students improved their on-task behavior, with average improvements ranging from 13% to 42%,
although functional changes in results for Students 1, 2, 4, and 5 were less compelling based on
variability, data overlap, and, for Student 4, delay in effect upon condition changes along with
the downward trend of the final intervention data. These four students might have benefited from
Tier 2 supports, which were not the focus of the present study. 

Finally, concerning social validity, results of the 5-item rating scale were positive, 
indicating that all teachers responded very true or mostly true to all five questions. These data 
and responses to the open-ended questions suggested that participating teachers viewed the 
implementation of CW-FIT MS as a positive experience, which they would consider repeating.
Student responses were also positive, with over 90% indicating they liked participating in
CW-FIT MS and thought the intervention helped them complete their assignments. These results
corroborate findings of earlier studies conducted in elementary schools (Caldarella et al., 2015; 
Kamps, Wills et al., 2015; Wills et al., 2016). For example, Kamps, Wills et al. (2015) noted that 
teacher participants liked the training, rated the intervention highly acceptable, and found it 
helpful in improving student behavior. 
Implications for Practice 

Because problem student behaviors continue to rank among the most critical concerns 
for teachers, practicing educators need training in approaches to effective proactive classroom 
management. Prior studies suggest that the middle school years (Grades 5-8) involve increased 
vulnerability for students who manifest challenging behavior with concurrent decline in 
academic performance. CW-FIT MS is based on the well-validated and highly effective 
elementary CW-FIT program, and preliminary research suggests that middle school teachers 
are able to implement it with fidelity after minimal training. As Wills et al. (2010) explained, 
the intervention consists of multiple research-based components, including a teaching 
component and a group reinforcement contingency at Tier 1. Preliminary results from the 
present study suggest that CW-FIT MS Tier 1 shows promise for replicating the effects
achieved in the elementary school studies: increased student on-task behavior, increased 
teacher praise rates, decreased teacher reprimand rates, high consumer satisfaction ratings, and 
positive implementation fidelity. 
Limitations and Areas for Future Research 

Although results of the study appear promising, current findings are preliminary and
should be interpreted cautiously as an initial attempt to examine CW-FIT implementation at the
middle school level. The population consisted of only three teachers and their classes, although
their schools were in diverse geographical locations. Additionally, all of the target students
nominated by their teachers were from ethnic minorities, though most students in these classes 
were Caucasian, suggesting the possibility of bias in screening and identification. A related issue 
is that all teachers in the study had over 15 years of experience and were considered veteran 
teachers (Adjei-Boateng & Amapdu, 2018), which may have impacted the results since novice 
and veteran teachers respond differently to teaching challenges. Future studies including novice 
teachers, as well as a more diverse sampling of students identified with externalizing and off-task 
behaviors, would help increase generalizability of the study findings, as replication of
single-subject studies in multiple contexts is paramount to strengthening generalizability (Horner et al.,
2005). 

Since CW-FIT MS is specifically designed to increase student engagement and maximize 
instructional time, a second limiting factor of this study is absence of academic assessment. 
Gathering information such as student grades, number of assignments completed, and subject 
area test scores would have enabled a more comprehensive view of participant performance to 
further validate CW-FIT MS implementation in middle schools. 

A third limitation of the study relates to the individual target student data which, although 
encouraging in terms of improved behavioral performance during CW-FIT MS, also discloses a 
high level of variability across students in baseline and intervention phases. We have yet to learn 
causes of these differences in individual student performance. Gathering additional demographic, 
archival, and interview data in future studies of CW-FIT MS might help in identifying 
underlying reasons for students’ differential responses to the intervention and to its specific
components. Conducting a functional behavior analysis to determine the purpose of students'
off-task or disruptive behavior, including provocation and reinforcement, would likely prove helpful
(Sugai et al., 2000). 

A fourth study limitation results from the restricted scope of applying the intervention in 
the three classes. Although CW-FIT MS is intended to be a multi-tiered positive behavior 
support intervention with enhancements such as self-management and functional assessment 
offered at Tiers 2 and 3 (Kamps, Conklin, & Wills, 2015), the present research examined only 
Tier 1. Assessing Tier 2 interventions would have been logistically challenging. Nevertheless, 
current data suggest that target students’ percentages of on-task behavior reached criterion levels 
without supplemental interventions. Past studies conducted in elementary school classrooms
have shown the Tier 1 component of CW-FIT to effectively decrease disruptive behaviors,
increase on-task behavior, and improve teachers' praise and reprimand frequencies (Kamps, 
Wills et al., 2015; Weeden et al., 2016; Wills et al., 2014; Wills et al., 2010). 

Considering these limitations, future research is needed to further validate and extend 
current findings. The use of a multi-tiered model to improve behavioral and academic outcomes 
for all students has been previously recommended (e.g., Wills et al., 2016). Yet due to the middle 
school environment and the age group characteristics, many questions remain regarding CW-FIT 
MS. 

Conclusion 

Although preliminary, results of the present study are consistent with earlier findings
documenting the efficacy of using Tier 1 of the class-wide CW-FIT intervention in elementary
schools with a variety of populations, age groups, and subject areas (Caldarella, Williams,
Jolstead, & Wills, 2017; Caldarella et al., 2015; Hansen, Caldarella, Williams, & Wills, 2017;
Kamps, Wills et al., 2015; Weeden et al., 2016). Current findings provide further evidence of
CW-FIT generalizability, suggesting that its middle school adaptation can significantly improve
behavioral outcomes and learning opportunities for these older students, particularly in math and
science general education classes.

Figure 1. Frequency of teachers’ praise and reprimands across three classrooms
Figure 3. Percentage of intervals on task for three individual target students
Figure 4. Percentage of intervals on task for three individual target students

Table 1
Procedural Fidelity Percentages: Means, Standard Deviations

           Baseline             Intervention         Reversal        Intervention
Teacher    M      SD            M      SD            M      SD       M      SD
1          0.7    [illegible]   95.0   [illegible]   8.88   6.9      97.0   6.0
2          0.0    0.0           97.0   4.3           40.0   28.8     85.0   3.6
3          0.0    0.0           79.1   16.9          0.0    0.0      95.4   2.8

Note. “High fidelity” defined as 85% or above.

Table 2
Social Validity

          CW-FIT easy       Received    Procedural          Support/    Will continue
          to learn/         adequate    fidelity training   feedback    using
Teacher   implement         training    effective           helpful     CW-FIT
1         4                 4           4                   4           3
2         4                 4           4                   4           3
3         3                 4           4                   4           3

Note. 4-point Likert Scale (1 = Not True to 4 = Very True).


